Cost Requests
Request Specific Cost Analysis:โ
Note: Administrator privileges are necessary to access the cost management modules on the platform.
The Request-specific cost analysis feature provides a versatile tool for in-depth examination of costs across different dimensions, including project, user, and model perspectives. This functionality allows users to conduct detailed project-based cost analyses, gaining insights into resource consumption specific to each project. Additionally, it facilitates user-based cost analysis, enabling the assessment of individual user contributions to overall expenses. Furthermore, the feature extends its utility to model-based analysis, allowing users to scrutinize the cost implications associated with each deployed model.
These requests are generated whenever you make a request from various functionalities such as chat, extract, summarize, generate, classify, embeddings, tuning-studio, and multimodal. With this comprehensive approach, stakeholders can make informed decisions regarding resource allocation, budgeting, and optimization strategies.
To access the Request specific cost analysis dashboard, follow these steps:
Login to Katonic Generative AI Platform:
Log in to your Katonic Generative AI platform account using your credentials.
Navigate to the Admin Section:
Once logged in, click on the 'Admin' section in the platform's interface.
Select Cost Insights Board:
Within the Admin section, locate and select the 'Requests' board.
Request specific Dashboard:โ
The following details will be available in the request specific cost dashboard.
Flexible Time Analysis: Effortlessly analyze requests over various time periods by applying intuitive filters.
Comprehensive Request Logging: Every request made across the platform is meticulously logged, providing a detailed overview of each interaction.
Each logged request includes the following details:
Created at Timestamp: Records the time when the request was created.
Status of the Request: Indicates whether the request was successful or encountered issues.
Request: Represents the input provided for the LLM.
Response: Displays the output generated by the model.
Model: Specifies the name of the LLM model used for the request.
Total Tokens: Reflects the total number of tokens utilized throughout the entire request.
Prompt Tokens: Identifies the number of input tokens used for the request.
Completion Tokens: Quantifies the tokens employed for the output generated by the model.
Latency: Measures the response time for the model to generate the output.
Type: Specifies the project type in which the request occurred (e.g., Chatbot, Extraction, Summarize, Generate, or Multimodal).
Username: Records the name of the user who initiated the request.
Cost: Indicates the total cost incurred for processing the request.